Abstract: In order to increase user confidence, many automated
theorem provers provide certificates that can be independently
verified. In this paper, we report on our progress in developing
a standalone tool for checking the correctness of certificates
for the termination of term rewrite systems, and formally proving
its correctness in the proof assistant Coq. To this end, we use
the extraction mechanism of Coq and the library on rewriting
theory and termination called CoLoR.

I. Introduction

Being able to prove the correctness of a program is important,
especially for critical applications (banking, aeronautics,
etc.). But program correctness is undecidable in general. So, many
different and complementary approaches have been developed for
tackling this problem: software engineering methodologies, testing,
model checking, formal proof, etc.

Instead of trying to prove that every possible output of a
program is correct, one possible approach consists in making
the tool provide, at each run, evidence that its output is
correct. This certificate can then be checked independently by
another tool. Although it seems to only move the problem
from one program to another, the certificate verifier, there is
in fact a gain in complexity. Typically, a program whose goal
is to find a solution to some numerical or symbolic problem
will use complex heuristics and optimizations, while checking
that the solution found is indeed correct is often much easier.
For instance, finding a boolean assignment satisfying some
boolean formula (SAT problem) takes, with known algorithms and in
the worst case, a time exponential in the number of boolean
variables, while verifying the correctness of a given assignment
(the certificate) is linear in the size of the formula.

Since certificate verifiers are simpler programs, they are
more easily amenable to a complete formalization and proof
using some proof assistant tool. In fact, various such tools (e.g.
Coq [1]) are themselves based on this two-level approach:
they are composed of a small and hopefully safe kernel
responsible for checking the correctness of proofs, and a proof
development environment providing unsafe proof tactics and
decision procedures for building step by step proofs that, in
the end, have to be checked by the kernel to be included in
the proof database.

Termination, that is, the fact that a program eventually
provides an output to the user, is an important property that
is also undecidable [2]. Term rewriting [3] is a simple yet
very general programming paradigm and framework, based
on the notion of rewrite rule, that generalizes other programming
paradigms, such as functional or logic programming, or in which
they can easily be encoded. Several programming languages are
based on rewriting. A few years ago, a formal language called
CPF [9] was developed that defines a notion of certificate for
the termination of term rewrite systems.

In this paper, we consider the problem of developing a
standalone tool for checking the correctness of CPF certificates,
and formally proving its correctness. In [10], the first
author describes a CPF verifier called Rainbow¹ based on the
following architecture: a compiler (written in OCaml [11])
from CPF to Gallina, the language of the Coq proof assistant
[1], generates a Gallina script that is then checked by Coq
itself using the Coq library CoLoR [10]. This architecture has
some advantages: it provides a way to automatically generate
Coq representations of term rewrite systems and termination
arguments that can be used for proving the termination of Coq
functions. Indeed, in Coq, no function can be defined without
proving its termination, because allowing non-terminating
functions would make proof verification undecidable. But this
architecture also has some disadvantages. First, compared to
more standard programming languages, computation in Coq
is very slow (and indeed too slow to check some complex
termination certificates). Second, the compiler from CPF to
Coq is not proved correct and can thus introduce errors not
present in the certificate.

Here, we consider a different architecture based on Coq's
ability to generate OCaml [11], Haskell [12] or Scheme [13]
programs equivalent to the functions defined in it [14]. It
consists in defining the CPF verification program directly
in Coq (except the parsing part), and proving its correctness.
Then, Coq's extraction mechanism provides us with an OCaml,
Haskell or Scheme standalone program that can be compiled
and efficiently executed independently of Coq or the CoLoR
library.

A similar approach has been undertaken successfully for the
CPF verifier CeTA [15] with the proof assistant Isabelle/HOL
[16], [17], which implements classical higher-order logic with
the axiom of choice [18]. Here, we want to test this approach
in the proof assistant Coq, which implements an extension
of intuitionistic higher-order logic [19], [20], and by using the
CoLoR library.

1 http://color.inria.fr/rainbow.html

The first problem to address is the representation in Coq of
CPF certificates. The second one is the formalization and proof
of the CPF verifier program using the Coq library on rewriting
theory and termination called CoLoR [10]. In particular, this
requires translating the CPF data structures into the data
structures used in CoLoR.

This paper is organized as follows. In section II, we introduce
term rewriting systems and give some examples of
termination techniques used in current automated termination
provers. In section III, we describe the formal language CPF
for termination certificates used in the international competition
of automated termination provers [21]. In section IV, we
introduce the proof assistant Coq and how to formalize and
prove the correctness of a certificate verifier in it. In section
V, we give some details on the representation of certificates
in Coq. Finally, in section VI, we give some details on the
formalization and proof of the verifier using the CoLoR library.

II. Term rewrite systems and their termination 

We first recall what rewriting is: "rewrite systems are
directed equations used to compute by repeatedly replacing
subterms of a given formula with equal terms until the simplest
form possible is obtained" [3]. More formally:

Definition 1 (Term rewrite system) Let $\mathcal{X}$ be an infinite
set of variables. Given a set $\mathcal{F}$ of function symbols
(disjoint from $\mathcal{X}$) and an arity function
$\alpha : \mathcal{F} \to \mathbb{N}$, the set
$T(\mathcal{F}, \mathcal{X})$ of (first-order) terms over
$\mathcal{F}$ and $\mathcal{X}$ is the smallest set containing
$\mathcal{X}$ and such that, if $f \in \mathcal{F}$ and
$t_1, \ldots, t_{\alpha(f)}$ are terms, then
$f(t_1, \ldots, t_{\alpha(f)})$ is a term.

A substitution $\sigma$ is a map from variables to terms that
is extended to terms in the obvious way ($x\sigma = \sigma(x)$ and
$f(t_1, \ldots, t_n)\sigma = f(t_1\sigma, \ldots, t_n\sigma)$). A
context $C$ is a term with a unique occurrence of a distinguished
variable $[]$, whose substitution by $u$ is written $C[u]$. A
(rewrite) rule is a pair of terms written $l \to r$. The rewrite
relation $\to_{\mathcal{R}}$ generated by a set $\mathcal{R}$ of
rules is the smallest relation containing $\mathcal{R}$ and
stable by substitution
($t \to_{\mathcal{R}} u \Rightarrow t\sigma \to_{\mathcal{R}} u\sigma$)
and context
($t \to_{\mathcal{R}} u \Rightarrow C[t] \to_{\mathcal{R}} C[u]$).

A relation $\to$ terminates (or is well-founded, or noetherian)
if there is no infinite sequence $t_0 \to t_1 \to \ldots$

A simple example of rewrite system is given by the addition
on unary natural numbers:

$\mathsf{add}(\mathsf{zero}, x) \to x$
$\mathsf{add}(\mathsf{succ}(x), y) \to \mathsf{succ}(\mathsf{add}(x, y))$
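To make the two rules concrete, here is a minimal OCaml sketch that applies them until no rule applies. The `term` type, the function names and the innermost strategy are illustrative assumptions, not part of CoLoR or Rainbow.

```ocaml
(* First-order terms: a variable, or a function symbol applied to arguments. *)
type term = Var of string | Fun of string * term list

(* One rewrite step at the root using the two rules for add, if one applies. *)
let step_root = function
  | Fun ("add", [Fun ("zero", []); x]) -> Some x
  | Fun ("add", [Fun ("succ", [x]); y]) ->
      Some (Fun ("succ", [Fun ("add", [x; y])]))
  | _ -> None

(* Innermost normalization: arguments are normalized before the root.
   This loops forever on a non-terminating system; it stops here because
   the system for add terminates. *)
let rec normalize t =
  match t with
  | Var _ -> t
  | Fun (f, args) ->
      let t' = Fun (f, List.map normalize args) in
      (match step_root t' with
       | Some u -> normalize u
       | None -> t')

(* Unary numerals, used only to build examples. *)
let rec num n =
  if n = 0 then Fun ("zero", []) else Fun ("succ", [num (n - 1)])
```

For instance, `normalize (Fun ("add", [num 2; num 1]))` rewrites the term for 2 + 1 to `num 3`.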

The termination of a TRS is undecidable in general, even
with a single rule [2]. So, there has been active research
for finding powerful sufficient conditions. An important one
consists in interpreting function symbols by monotone
polynomials on natural numbers $\mathbb{N}$ [22], [23]:

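Theorem 2 (Polynomial interpretation) Let $\mathcal{R}$ be a TRS
and $\varphi$ be a function mapping a polynomial
$\varphi_f \in \mathbb{Z}[X_1, \ldots, X_n]$ to each function symbol
$f$ of arity $n$. Given a valuation $a : \mathcal{X} \to \mathbb{N}$,
let $[x]_a = a(x)$ and
$[f(t_1, \ldots, t_n)]_a = \varphi_f([t_1]_a, \ldots, [t_n]_a)$ be
the interpretation of terms in $\mathbb{Z}$ induced by $\varphi$,
and $t >_\varphi u$ if, for all $a$, $[t]_a >_{\mathbb{N}} [u]_a$,
the well-founded ordering on terms induced by $\varphi$.

If every $\varphi_f$ is monotone in every $x_i$,
$\mathcal{R}_1 \subseteq {>_\varphi}$ and $\to_{\mathcal{R}_2}$
terminates, then $\to_{\mathcal{R}_1 \cup \mathcal{R}_2}$ terminates.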

For instance, the previous system can be proved terminating
by using the following polynomial interpretation on $\mathbb{N}$:

$\varphi_{\mathsf{add}}(x, y) = 2x + y$
$\varphi_{\mathsf{succ}}(x) = x + 1$
$\varphi_{\mathsf{zero}} = 1$

Indeed, for the first rule, we have $2(1) + x >_{\mathbb{N}} x$ and,
for the second rule, we have
$2(x + 1) + y >_{\mathbb{N}} (2x + y) + 1$, whatever the values of
$x, y \in \mathbb{N}$ are.
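The two inequalities can also be tested numerically. The following OCaml sketch evaluates both sides of each rule under the interpretation above for all valuations up to a bound. Testing finitely many valuations is only a sanity check and not a proof; a verifier has to establish the inequality symbolically. All names are illustrative assumptions.

```ocaml
type term = Var of string | Fun of string * term list

(* The interpretation of the three symbols as functions on naturals. *)
let interp f args =
  match f, args with
  | "add", [x; y] -> 2 * x + y
  | "succ", [x] -> x + 1
  | "zero", [] -> 1
  | _ -> invalid_arg "interp: unknown symbol or wrong arity"

(* Value of a term under a valuation of its variables. *)
let rec eval valuation = function
  | Var x -> valuation x
  | Fun (f, ts) -> interp f (List.map (eval valuation) ts)

let x = Var "x" and y = Var "y"

(* The two rules for add, as pairs (left-hand side, right-hand side). *)
let rules = [
  (Fun ("add", [Fun ("zero", []); x]), x);
  (Fun ("add", [Fun ("succ", [x]); y]), Fun ("succ", [Fun ("add", [x; y])]));
]

(* Check that lhs > rhs for every valuation with x, y in 0..bound. *)
let decreases_up_to bound (l, r) =
  let ok = ref true in
  for a = 0 to bound do
    for b = 0 to bound do
      let v = function "x" -> a | "y" -> b | _ -> 0 in
      if eval v l <= eval v r then ok := false
    done
  done;
  !ok
```

With these definitions, `List.for_all (decreases_up_to 20) rules` holds, in agreement with the two inequalities.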

Another very important method, at the basis of all current
TRS termination provers, consists in transforming a TRS into
a dependency pair (DP) problem [24]:

Definition 3 (Dependency pair) Given a set of symbols $\mathcal{F}$,
the set $\mathcal{F}^\sharp = \mathcal{F} \uplus \{f^\sharp \mid f \in \mathcal{F}\}$,
which consists of the disjoint union of $\mathcal{F}$ with a copy
of $\mathcal{F}$, is the set of marked and unmarked symbols
($f^\sharp$ is taken to be of the same arity as $f$). Given a
set $\mathcal{R}$ of rules, a symbol $f$ is said to be defined if
there is a rule whose left-hand side is of the form
$f(l_1, \ldots, l_n)$. Let $\mathcal{D}(\mathcal{R})$ be the
set of defined symbols. The set of dependency pairs
$\mathrm{DP}(\mathcal{R})$ is then the set of marked rules
$f^\sharp(l_1, \ldots, l_n) \to g^\sharp(r_1, \ldots, r_p)$
such that $f(l_1, \ldots, l_n) \to r \in \mathcal{R}$ for some $r$,
$g(r_1, \ldots, r_p)$ is a subterm of $r$ that does not occur in
any $l_k$, and $g$ is defined. The dependency graph, whose nodes
are the elements of $\mathrm{DP}(\mathcal{R})$, has an edge
between $(l_1, r_1)$ and $(l_2, r_2)$ if there are two
substitutions $\sigma_1$ and $\sigma_2$ such that
$r_1\sigma_1 \to_{\mathcal{R}}^{*} l_2\sigma_2$.

Indeed, $\to_{\mathcal{R}}$ terminates on
$T(\mathcal{F}, \mathcal{X})$ iff the composition
of the reflexive-transitive closure of $\to_{\mathcal{R}}$ with the
closure by substitution of $\mathrm{DP}(\mathcal{R})$, written
$\to_{\mathrm{DP}(\mathcal{R})h}$, terminates on
$T(\mathcal{F}^\sharp, \mathcal{X})$. Intuitively, dependency pairs
generalize the notion of recursive calls and call graph in
functional programming [25]. Interpretations in a well-founded
domain are easily extended to deal with this more general kind
of relations. Moreover, since we only consider the closure by
substitution of $\mathrm{DP}(\mathcal{R})$, only one dependency
pair needs to strictly decrease in every cycle or, more simply,
in every connected component of the dependency graph. This
allows one to split a DP problem into various independent DP
sub-problems [26].

For instance, in our simple example, there is only
one dependency pair,
$\mathsf{add}^\sharp(\mathsf{succ}(x), y) \to \mathsf{add}^\sharp(x, y)$,
the termination of which can be proved by taking
$\varphi_{\mathsf{add}^\sharp}(x, y) = x$.
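The set of dependency pairs can be computed directly from the definition. The OCaml sketch below does so for rules given as pairs of terms. Encoding the marked copy of a symbol by a "#" suffix is an arbitrary choice made for this illustration, and all names are assumptions.

```ocaml
type term = Var of string | Fun of string * term list
type rule = term * term

(* All subterms of a term, the term itself included. *)
let rec subterms t =
  match t with
  | Var _ -> [t]
  | Fun (_, ts) -> t :: List.concat (List.map subterms ts)

(* Defined symbols: those at the root of some left-hand side. *)
let defined (rules : rule list) : string list =
  List.concat
    (List.map (function (Fun (f, _), _) -> [f] | _ -> []) rules)

(* Marked copy of a symbol (illustrative encoding). *)
let mark f = f ^ "#"

let dependency_pairs (rules : rule list) : rule list =
  let ds = defined rules in
  let pairs_of (l, r) =
    match l with
    | Var _ -> []
    | Fun (f, ls) ->
        (* subterms occurring in the arguments of the left-hand side *)
        let in_lhs_args = List.concat (List.map subterms ls) in
        List.concat
          (List.map
             (function
               | Fun (g, rs) as u
                 when List.mem g ds && not (List.mem u in_lhs_args) ->
                   [ (Fun (mark f, ls), Fun (mark g, rs)) ]
               | _ -> [])
             (subterms r))
  in
  List.concat (List.map pairs_of rules)
```

Applied to the two rules for add, this yields the single pair from add#(succ(x), y) to add#(x, y).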

III. Termination certificates 

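The theorem on polynomial interpretation can be described
as a conditional deduction rule on termination problems:

$$\text{(rule-removal-PI)} \quad \frac{\mathrm{Mon}(\varphi) \qquad \mathcal{R}_1 \subseteq {>_\varphi} \qquad \mathrm{WF}(\to_{\mathcal{R}_2})}{\mathrm{WF}(\to_{\mathcal{R}_1 \cup \mathcal{R}_2})}$$

where $\mathrm{Mon}(\varphi)$ means that every $\varphi_f$ is
monotone in every $x_i$, $\mathcal{R}_1 \subseteq {>_\varphi}$ that
every rule of $\mathcal{R}_1$ is strictly decreasing in the
interpretation, and $\mathrm{WF}(\to_{\mathcal{R}_2})$ that
$\to_{\mathcal{R}_2}$ terminates (is well-founded).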

Similar conditional deduction rules can be written for most
if not all termination methods used in current termination
provers [27]. Hence, a termination proof can be described by
a deduction tree obtained by composing deduction rules like
(rule-removal-PI) and axioms like:

$$\text{(empty)} \quad \frac{\mathcal{R} = \emptyset}{\mathrm{WF}(\to_{\mathcal{R}})}$$
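Such a deduction tree can be read as a recursive data structure, and checking it as a recursive function over that structure. The OCaml sketch below is a strong simplification made for illustration: the interpretation is replaced by a predicate telling which rules are strictly decreasing, and monotony is not checked. The type and function names are assumptions.

```ocaml
(* A proof tree built from the axiom (empty) and the rule
   (rule-removal-PI). The predicate stands for "this rule is strictly
   decreasing in the interpretation"; monotony is not modelled here. *)
type 'rule proof =
  | Empty
  | Rule_removal of ('rule -> bool) * 'rule proof

(* check rules p: does p justify the termination of rules? *)
let rec check (rules : 'rule list) (p : 'rule proof) : bool =
  match p with
  | Empty -> rules = []                       (* axiom: no rule left *)
  | Rule_removal (strict, sub) ->
      (* remove the strictly decreasing rules, check the rest with sub *)
      check (List.filter (fun r -> not (strict r)) rules) sub
```

Rules are kept abstract here: with integers standing for rules, `check [1; 2; 3] (Rule_removal ((fun r -> r > 1), Rule_removal ((fun r -> r = 1), Empty)))` succeeds, while `check [1] Empty` fails because a rule is left unjustified.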

For the international competition of automated termination
provers [21], a formal language called CPF [9] has been
collectively defined for representing such deduction trees. It
is given as an XML Schema or XSD file [28], [29]. An XSD
file is like a grammar: it describes the set of XML files that are
admissible. XML is a well-established W3C text file standard
[30] for describing tree-structured data. For instance, in CPF,
a rewrite rule has to be described by the following XML text:

<rule><lhs>...</lhs><rhs>...</rhs></rule>

It represents a labeled tree whose root is labeled by the
tag rule and has two sub-trees: the first one describes the rule
left-hand side and has its root labeled by the tag lhs, and the
second one describes the rule right-hand side and has its root
labeled by the tag rhs. The XML Schema language (which
is a subset of XML) allows one to describe some set of valid
XML texts by declaring what the possible labeled trees are.
For instance, the XSD type used in CPF for rewrite rules is:

<xs:element name="rule">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="lhs">
        <xs:complexType>
          <xs:group ref="term"/>
        </xs:complexType>
      </xs:element>
      <xs:element name="rhs">
        <xs:complexType>
          <xs:group ref="term"/>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
  </xs:complexType>
</xs:element>

The main type constructors allowed in XSD are, informally:

• element: if $T$ is an XSD type and $x$ is a string,
then <element name="x">T</element> denotes the set
of trees whose root is labeled by $x$ and whose children
belong to the set of trees corresponding to $T$.

• sequence: if $T_1, \ldots, T_n$ are XSD types, then
<sequence>T_1 ... T_n</sequence> denotes² the set
of tuples of trees $(t_1, \ldots, t_n)$ such that $t_1$ is of
type $T_1$, ..., $t_n$ is of type $T_n$.

• choice: if $T_1, \ldots, T_n$ are XSD types, then
<choice>T_1 ... T_n</choice> denotes the union of
the sets of trees corresponding to $T_1, \ldots, T_n$.

2 In the complete definition, every type $T_i$ can be equipped with
two attributes $a \in \mathbb{N}$ and $b \in \mathbb{N} \cup \{\infty\}$
specifying the minimum and maximum numbers ($\infty$ meaning
arbitrary) of children of type $T_i$.

IV. Formalization and proof of a certificate verifier in Coq

The Coq proof assistant [1] is a tool that allows one to
formally define mathematical objects and prove statements
about them. It has been successfully used in the certification
of various important applications, either industrial: a JavaCard
platform [31] or a C compiler [32], or academic: the four
color theorem [33] or Kepler's conjecture [34].

It is based on an extension of Girard's system F [35] and
Martin-Löf type theory [36], called the calculus of inductive
constructions [19], [20]. It allows function definitions by
pattern-matching [37] and provides a programmable proof
tactic language [38], various decision procedures, and other
important features like modules, type classes, etc.

It is therefore possible to define in Coq an inductive data
type cpf for representing CPF predicates, a boolean function
check : trs -> cpf -> bool verifying the correctness of a
certificate wrt a termination problem, and formally prove that
this function is correct, that is, in Coq syntax:

Theorem check_is_correct :
  forall R x, check R x = true -> WF (red R).
Proof. ... Qed.

In fact, in order to provide useful error messages if a
certificate appears to be incorrect, and to deal with certificates
that the verifier does not know how to handle yet (there are many
different certificates in CPF and handling all of them is a huge
amount of work), instead of a boolean output, we use an
error monad [39]. And since many auxiliary functions are
necessary for translating CPF data structures into CoLoR data
structures, we use a polymorphic error monad:

Inductive result (A : Type) : Type :=
| Ok : A -> result A
| Ko : error -> result A.

Definition term : cpf_term -> result color_term := ...

Theorem check_is_correct :
  forall R x, check R x = Ok unit -> WF (red R).
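On the OCaml side, a polymorphic error monad of this shape can be used as sketched below. The `error` constructors, the `bind` operator and the helper `map_result` are illustrative assumptions. They are not the definitions used in the verifier, and the code extracted by Coq may look different.

```ocaml
(* Hypothetical error type: the actual set of errors is not shown here. *)
type error = Unsupported of string | Invalid of string

(* Same shape as the Coq type result; it shadows OCaml's built-in one. *)
type 'a result = Ok of 'a | Ko of error

(* Sequencing: run f on the value of m unless m is already an error. *)
let bind (m : 'a result) (f : 'a -> 'b result) : 'b result =
  match m with
  | Ok x -> f x
  | Ko e -> Ko e

let ( >>= ) = bind

(* Apply a fallible translation to each element of a list, stopping at
   the first error. *)
let rec map_result f = function
  | [] -> Ok []
  | x :: xs ->
      f x >>= fun y ->
      map_result f xs >>= fun ys ->
      Ok (y :: ys)

(* Hypothetical translation step: accept only non-negative arities. *)
let check_arity n =
  if n >= 0 then Ok n else Ko (Invalid "negative arity")
```

With these definitions, `map_result check_arity [0; 2; 1]` returns `Ok [0; 2; 1]`, while a list containing `-1` returns `Ko (Invalid "negative arity")` without the caller having to test each intermediate result.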

Finally, since Coq includes a typed λ-calculus with inductive
data types and pattern-matching³, the extraction of ML-like
function definitions [40] from Coq to OCaml [14] is almost
straightforward⁴ and looks about the same since Coq syntax
is very close to OCaml syntax.

V. Parsing and Coq representation of certificates 

The CPF format is extended every year with new certificates
and is sometimes modified. In Rainbow, the data type
used for representing certificates internally and the parsing
function used to create a value of this data type from a text
file are written by hand (the parsing function uses the
XML-Light library [42]). This is a possible source of errors and is
time-consuming.

3 Note however that parallel pattern-matching and pattern-matching with
patterns of depth greater than 1 are not primitive in Coq. They are compiled
into sequences of non-parallel pattern-matching with patterns of depth 1,
leading to significant code duplication in some cases.

4 This is however not the case for more complex Coq constructions [14],
[41].

To avoid these problems, we developed a compiler from
XSD to Coq and OCaml that, from an XSD file, generates a
Coq file (and hence an OCaml file after extraction from Coq)
providing a data type definition for representing XML data
valid wrt the given XSD file, and an OCaml file providing a
parsing function for this data type (also based on XML-Light).
This compiler is not intended to cover all aspects of XSD but
only those used in CPF.

The XSD type constructors described above are translated 
to standard OCaml data structures as follows (with some 
optimizations): 

• sequence: tuple or list (an optional child being mapped 
to the OCaml option type); 

• choice: data type with a constructor for each case. 

For instance, in CPF, the type for function symbols is 
defined as follows: 

<xs:group name="symbol">
  <xs:choice>
    <xs:element ref="name"/>
    <xs:element name="sharp">
      <xs:complexType>
        <xs:sequence>
          <xs:group ref="symbol"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
    <xs:element name="labeledSymbol">
      <xs:complexType>
        <xs:sequence>
          <xs:group ref="symbol"/>
          <xs:group ref="label"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:choice>
</xs:group>

where <group name="x"> is a way in XSD to introduce a
type definition that can be referred to by x. This XSD type is
translated by our compiler to the following inductive OCaml
data type:

type symbol =
  | Symbol_name of name
  | Symbol_sharp of symbol
  | Symbol_labeledSymbol of symbol * label
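A parsing function for this data type can follow the structure of the XSD definition case by case. The OCaml sketch below illustrates the idea on a simplified tree type defined locally as a stand-in for the tree type of an XML library. The representation of `name` and `label` as strings and the `Parse_error` exception are assumptions made for the example; the generated parser may be organized differently.

```ocaml
(* Simplified XML trees (no attributes); a stand-in for a library type. *)
type xml = Element of string * xml list | PCData of string

type name = string
type label = string   (* hypothetical: labels in CPF are richer than this *)

type symbol =
  | Symbol_name of name
  | Symbol_sharp of symbol
  | Symbol_labeledSymbol of symbol * label

exception Parse_error of string

(* One case per alternative of the choice in the XSD group "symbol". *)
let rec parse_symbol = function
  | Element ("name", [PCData s]) -> Symbol_name s
  | Element ("sharp", [x]) -> Symbol_sharp (parse_symbol x)
  | Element ("labeledSymbol", [x; Element ("label", [PCData l])]) ->
      Symbol_labeledSymbol (parse_symbol x, l)
  | _ -> raise (Parse_error "symbol")
```

For instance, the tree for `<sharp><name>add</name></sharp>` is parsed to `Symbol_sharp (Symbol_name "add")`, and any tree not matching one of the three alternatives is rejected.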

Other solutions could be chosen. Note however that not
every OCaml value corresponds to an XML file validating
CPF. To ensure this, we would need to use private data types [43]
or a stronger type system like the one of CDuce [44], [45].

More importantly, in XSD, type definitions are unordered
and a type definition can refer to types defined later in the file.
This is not a problem in itself for OCaml or Coq since these
languages support mutually defined types too. However, if CPF
is represented in Coq as a single big set of mutually defined
types, then Coq will generate a single big induction principle
for all types that will be very difficult to use in proofs. It
is therefore better to have as many minimal sets of mutually
defined types as possible. And because, in Coq and OCaml,
the type names used in a type definition can only refer to
type names of the same set of mutually defined types or to
previously defined types, it is necessary to order the XSD type
definitions wrt their dependencies:

Definition 4 (Type dependency relation) For our purpose⁵,
we can consider that a type $T$ is defined by a finite set of
constructors, the arguments of which are of types
$T_1, \ldots, T_n$ respectively. Then, we say that a type $T$
depends on a type $U$, written $U \lhd T$, if there is a
constructor of $T$ having an argument of type $U$. And we say
that a type $U$ must be defined before a type $T$, written
$U \preceq T$, if $(U, T)$ is in the reflexive and transitive
closure of $\lhd$. We then denote by $\simeq$ the symmetric part
${\preceq} \cap {\succeq}$ of $\preceq$ (it is an equivalence
relation), and by ${\prec} = {\preceq} \setminus {\simeq}$ its
strict part.

The minimal sets of mutually dependent types then correspond
to the equivalence classes of $\simeq$, and these classes can be
ordered topologically by using $\prec$.
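The ordering of type definitions can be computed from the dependency relation alone. The OCaml sketch below is a deliberately naive version, suitable for small inputs and not necessarily the algorithm used in our compiler: it computes the types reachable from each type, groups mutually reachable types into classes, and sorts the classes by the number of reachable types. This is a valid topological order because a class that must come first reaches strictly fewer types. The example dependencies are hypothetical.

```ocaml
(* Types reachable from t in zero or more dependency steps, t included.
   deps t lists the types occurring as constructor arguments of t. *)
let reachable (deps : string -> string list) (t : string) : string list =
  let rec go seen = function
    | [] -> seen
    | u :: rest ->
        if List.mem u seen then go seen rest
        else go (u :: seen) (deps u @ rest)
  in
  go [] [t]

(* Classes of mutually dependent types, dependencies first. *)
let order_types (types : string list) (deps : string -> string list)
    : string list list =
  (* before u t: u must be defined before (or together with) t *)
  let before u t = List.mem u (reachable deps t) in
  let equiv u t = before u t && before t u in
  let classes =
    List.fold_left
      (fun acc t ->
        if List.exists (List.mem t) acc then acc
        else List.filter (equiv t) types :: acc)
      [] types
  in
  (* a class depending on another one reaches strictly more types *)
  let size c = List.length (reachable deps (List.hd c)) in
  List.sort (fun a b -> compare (size a) (size b)) classes

(* Hypothetical dependencies, with a cycle between symbol and label. *)
let example_deps = function
  | "rule" -> ["term"]
  | "term" -> ["symbol"; "term"]
  | "symbol" -> ["name"; "symbol"; "label"]
  | "label" -> ["symbol"]
  | _ -> []
```

On the list `["rule"; "term"; "symbol"; "name"; "label"]`, this produces the classes `["name"]`, `["symbol"; "label"]`, `["term"]` and `["rule"]`, in that order.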

VI. Definition and proof of a termination certificate verifier in Coq

The first problem to address is the translation of CPF data 
structures for symbols, terms, rules, polynomials, etc. to the 
corresponding CoLoR data structures. In fact, this is more or 
less straightforward except for terms. 

In CoLoR, every definition or theorem is parametrized by 
a given signature: 

Record Signature : Type := mkSignature {
  symbol :> Type;
  arity : symbol -> nat;
  beq_symb : symbol -> symbol -> bool;
  beq_symb_ok :
    forall x y, beq_symb x y = true <-> x = y }.

providing the set of symbols, their arity and a boolean function
on symbols ensuring that equality on symbols is decidable.

New sets of symbols are then introduced when needed, as is the
case for marked symbols in the dependency pairs transformation.
Moreover, some termination techniques may change the
arity of symbols. For instance, argument filtering [24] may
transform a TRS where f is of arity $n > 1$ into a TRS where
f is of arity $n - 1$ by removing the first argument of f in every
rule where f occurs.

Hence, in CoLoR, the set of symbols and their arity may
evolve dynamically during the verification of a certificate, and
differently depending on the deduction branch followed (a
certificate has a tree structure), while, in CPF, there is only
one big type for all the possible symbols. Defining a function
for converting a CPF term into a CoLoR term following the same
dynamics would be complicated.

Instead, we use the fact that the CPF type for symbols
includes all possible symbols that can be generated in the
course of a verification, and choose the CPF type itself for
the set of CoLoR symbols. Hence, only the arity function
needs to evolve dynamically. Note that it is correct to do
so since signature extension reflects termination: given a set
$\mathcal{R}$ of rules on $T(\mathcal{F}, \mathcal{X})$, if
$\mathcal{F} \subseteq \mathcal{G}$, then $\to_{\mathcal{R}}$
terminates iff $\to_{\mathcal{R}}^{\mathcal{G}}$ terminates, where
$\to_{\mathcal{R}}^{\mathcal{G}}$ is the relation generated by
$\mathcal{R}$ on $T(\mathcal{G}, \mathcal{X})$ [46].

5 This is the class of OCaml types to which XSD types are compiled.

As a consequence, we need to translate CoLoR data struc- 
tures for new symbols back into the cpf data type. To 
prove that this transformation reflects termination, we use 
the following theorem on signature morphisms formalized in 
CoLoR: 

Theorem 5 (Signature morphism) Let $\mathcal{F}$ and $\mathcal{G}$
be two sets of symbols whose arity functions are $\alpha$ and
$\beta$ respectively, and let $\varphi$ be a map from $\mathcal{F}$
to $\mathcal{G}$ that respects arities, i.e. for all
$f \in \mathcal{F}$, $\beta_{\varphi(f)} = \alpha_f$. The map
$\varphi$ then naturally extends to terms as follows:
$\varphi(x) = x$ and
$\varphi(f(t_1, \ldots, t_n)) = \varphi(f)(\varphi(t_1), \ldots, \varphi(t_n))$.

If $\mathcal{R}$ is a set of rules on $T(\mathcal{F}, \mathcal{X})$
and $\to_{\varphi(\mathcal{R})}$ terminates on
$T(\mathcal{G}, \mathcal{X})$, then $\to_{\mathcal{R}}$ terminates
on $T(\mathcal{F}, \mathcal{X})$.

Note that no property is required for $\varphi$ other than to
respect arities. In particular, it does not need to be injective.

We now show how this applies to the DP transformation.
Let $\mathcal{F}$ be the set of symbols corresponding to the data
type symbol defined in the previous section. To simplify, we do not
consider the constructor Symbol_labeledSymbol. So, $\mathcal{F}$
can be seen as the solution of the equation
$X = \mathcal{N} \uplus \{f^@ \mid f \in X\}$,
where $\mathcal{N}$ is the set of values of type name and $@$
stands for the constructor Symbol_sharp, to distinguish it from the
symbol $\sharp$ used in the DP transformation. Let $\mathcal{R}$ be
a set of rules on $\mathcal{F}$ with no symbol of the form $f^@$
such that $\to_{\mathcal{D}h}$ (the composition of
$\to_{\mathcal{R}}^{*}$ with the closure by substitution of
$\mathcal{D}$) terminates, where
$\mathcal{D} = \varphi(\mathrm{DP}(\mathcal{R}))$ with
$\varphi(f^\sharp) = f^@$ and $\varphi(f) = f$ otherwise. Then, by
the theorem on signature morphisms,
$\to_{\mathrm{DP}(\mathcal{R})h}$ terminates and, by the DP theorem,
$\to_{\mathcal{R}}$ terminates.
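The map used in this argument and its extension to terms can be written down directly. In the OCaml sketch below, the types `dp_symbol` and `term` and the function `map_term` are illustrative assumptions that mirror the definitions in the text; they are not the CoLoR definitions, and labeled symbols are left out as above.

```ocaml
(* CPF symbols without labeled symbols, as in the simplification above. *)
type symbol = Symbol_name of string | Symbol_sharp of symbol

(* Symbols of the DP transformation over a set 'f: unmarked or marked. *)
type 'f dp_symbol = Unmarked of 'f | Marked of 'f

(* Terms over an arbitrary set of symbols. *)
type 'f term = Var of string | Fun of 'f * 'f term list

(* Extension of a map on symbols to terms: variables are unchanged. *)
let rec map_term phi = function
  | Var x -> Var x
  | Fun (f, ts) -> Fun (phi f, List.map (map_term phi) ts)

(* A marked symbol is sent to the CPF symbol built with Symbol_sharp,
   and an unmarked symbol is sent to itself. *)
let phi : symbol dp_symbol -> symbol = function
  | Marked f -> Symbol_sharp f
  | Unmarked f -> f
```

For example, applying `map_term phi` to the marked left-hand side `Fun (Marked (Symbol_name "add"), [Var "x"; Var "y"])` gives `Fun (Symbol_sharp (Symbol_name "add"), [Var "x"; Var "y"])`, a term over the single CPF type of symbols.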

VII. Conclusion 

We started to develop a standalone tool for verifying the
correctness of termination certificates for term rewrite systems
[3] following the CPF format [9] used in the international
competition of automated termination provers [21], and to formally
prove its correctness in the proof assistant Coq [1] using the
Coq library on rewriting theory and termination CoLoR [10]
and the Coq extraction mechanism [14].

We first developed a simple compiler for generating a Coq
data type definition for representing XML Schema data types,
and an XML parser for CPF. We also defined and proved in
Coq a small verifier for two important termination techniques:
dependency pairs [24] and polynomial interpretations [23]. But
much more has to be done to be able to compete with the
verifier CeTA developed in the proof assistant Isabelle/HOL
[15].